1.  Introduction

Much work has been done on the parallel evaluation of arithmetic
expressions. Some references are [1], [2], [3], [10], and [11].
Brent [1], for instance, has shown that arithmetic expressions
containing n, n ≥ 1, operands; operators (+, *, and /); and
parentheses can be evaluated in 4 log_2 n + 10(n-1)/p time when
p processors are available. Unfortunately, little work seems to
have been done on the parallel generation of executable code for
arithmetic expressions. Fischer [5] considers the parsing of
expressions on a vector (pipelined) machine. No work has been
done on the parsing of expressions on parallel multiprocessor
machines. In this paper, we address this problem. Specifically,
we study the following problems:

(1)  parallel  generation  of  the  postfix  form 

(2)  parallel  generation  of  the  binary  tree  form 

In both cases, we start with the infix form of the expression.
Further, we assume that the input infix expression is
syntactically correct. The reader unfamiliar with the postfix
and tree forms of an expression is referred to Horowitz and
Sahni [6].

The study of the two problems cited above is motivated by the
following considerations:

(1)  We could conceivably build a special purpose infix-to-postfix
chip that could be used like a peripheral on a very high speed
number cruncher. The use of this parallel translator chip
would speed compilation of programs.

(2)  Most code optimizers for single processor machines start with
the tree form of an expression. Hence a high speed special
purpose chip that performs the translation from infix to tree
form could be used in the context of (1).

(3)  If the program is to be executed on a parallel machine, it
can also be compiled on that machine using a parallel
compiler. Such a compiler will need to be able to translate,
in parallel, from the infix form to a more usable form. The
postfix and tree forms are two such forms. In fact, the
parallel evaluation methods suggested in [1], [2], [3], [10],
and [11] all begin with the tree form of the arithmetic
expression.

(4)  While the length of individual arithmetic expressions in
typical programs is small (Knuth [7]), Kuck [9] has shown that
optimizing compilers for parallel machines can generate very
long expressions even when the input program contains only
short expressions. Furthermore, it is possible to view the
entire program as a single expression and obtain its postfix
form.


The  model  of  parallel  computation  that  we  shall  use 
here  is  commonly  referred  to  as  the  shared  memory  model 
(SMM).  This  has  the  following  characteristics: 

(1)  There are p processing elements (PEs) or processors. These
are indexed 0,1,...,p-1 and an individual PE may be referenced
as in PE(i). Each PE is capable of performing the standard
arithmetic and logical operations. In addition, each PE knows
its index.

(2)  There is a common memory that is shared by all the PEs. All
p PEs can read and write into this memory in the same time
instance. If two PEs attempt to read the same word of memory
simultaneously, a read conflict occurs. Similarly, if two PEs
attempt to simultaneously write into the same word of memory,
a write conflict occurs. In this paper, we assume that read
and write conflicts are prohibited.

(3)  The PEs are synchronized and operate under the control of a
single instruction stream.

(4)  An enable/disable mask can be used to select a subset of the
PEs that are to perform an instruction. Only the enabled PEs
will perform the instruction. The remaining PEs will be idle.
All enabled PEs execute the same instruction. The set of
enabled PEs can change from instruction to instruction.

Much work has been done on the design of parallel algorithms
using the SMM. The reader is referred to [3], [4] and the
references contained therein.

While one can talk of obtaining the postfix and tree forms for an
entire program, we shall limit our discussion here to simple
expressions. These are permitted to contain only operands
(constants and simple variables), operators (only the binary
operators +, -, *, /, and ↑ are permitted), and delimiters
('(' and ')').

Like every other parallel algorithm, our algorithms are based on
a sequential algorithm. The sequential infix to postfix
algorithm we start from is that given by Horowitz and Sahni [6].
This algorithm utilizes a stack as well as a dual priority
system. The instack priority (ISP) of an operator or delimiter
is the priority associated with the operator or delimiter when it
is inside the stack. The incoming priority (ICP) is used when
the operator or delimiter is outside the stack. For the operator
and delimiter set we are limited to, the priority assignment of
Figure 1 is adequate.

The infix to postfix algorithm of [6] is reproduced in Figure 2.
This algorithm assumes that the infix expression is in E(1:n)
where E(i) is an operator, operand, or delimiter, 1 ≤ i ≤ n (in
practice, E(i) will be a pointer into a symbol table). For
example, the expression A+B*C is input as E(1)=A, E(2)=+, E(3)=B,
E(4)=*, and E(5)=C. The postfix form is output in P(1:m), m ≤ n.
For our example, we shall have P(1)=A, P(2)=B, P(3)=C, P(4)=*,
and P(5)=+. The time complexity of procedure POSTFIX is O(n).


operator/delimiter          ISP    ICP

↑, unary +, unary -
*, /
binary +, -
(

Figure 1: Instack and incoming priorities.
[The ISP and ICP columns are not legible in the source.]


procedure POSTFIX(E,P,n,m)
  //Translate the infix expression E(1:n) into postfix//
  //form. The postfix form is output in P(1:m)//
  //'-∞' is used as bottom of stack character and has//
  //ISP=0//
  declare n, E(1:n), P(1:m), top, STACK(), i, m
  STACK(1) ← '-∞'; top ← 1   //initialize STACK//
  m ← 0
  for i ← 1 to n do
    case
      :E(i) is an operand: m ← m+1; P(m) ← E(i)
      :E(i) = ')': while STACK(top) ≠ '(' do
                     //unstack until '('//
                     m ← m+1; P(m) ← STACK(top); top ← top-1
                   endwhile
                   top ← top-1
      :else: while ISP(STACK(top)) ≥ ICP(E(i)) do
               m ← m+1; P(m) ← STACK(top); top ← top-1
             endwhile
             top ← top+1; STACK(top) ← E(i)
    endcase
  endfor
  while top > 1 do   //empty stack//
    m ← m+1; P(m) ← STACK(top); top ← top-1
  endwhile
end POSTFIX


Figure 2  Sequential infix to postfix algorithm
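The procedure of Figure 2 can be sketched in Python as follows. This is only an illustration: the priority values in `ISP` and `ICP` are assumptions chosen to be consistent with the behaviour described in the text (↑, written `^` here, is right associative and '(' is always stacked), and the bottom-of-stack marker `#` stands in for '-∞'.

```python
# ASSUMED priorities (hypothetical values, not taken from Figure 1).
ISP = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1, "(": 0, "#": 0}  # '#' = bottom marker
ICP = {"^": 4, "*": 2, "/": 2, "+": 1, "-": 1, "(": 4}


def postfix(tokens):
    """Sequential infix-to-postfix translation of a list of tokens."""
    out = []
    stack = ["#"]                        # bottom-of-stack marker with ISP 0
    for tok in tokens:
        if tok == ")":
            while stack[-1] != "(":      # unstack until '('
                out.append(stack.pop())
            stack.pop()                  # discard the '(' itself
        elif tok in ICP:                 # operator or '('
            while ISP[stack[-1]] >= ICP[tok]:
                out.append(stack.pop())
            stack.append(tok)
        else:                            # operand
            out.append(tok)
    while len(stack) > 1:                # empty the stack
        out.append(stack.pop())
    return out


print("".join(postfix(list("A+B*C"))))
```

Each token is pushed and popped at most once, which is where the O(n) bound comes from.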


While it is often difficult to parallelize algorithms that
utilize a stack, in Section 2 we shall see that the algorithm of
Figure 2 can in fact be effectively parallelized. In Section 3,
we shall see how the tree form of an infix expression may be
obtained in parallel.


2.  Parallel Generation of the Postfix Form


Let the infix expression be given in E(1:n) as described in
Section 1. We make the added assumption that E does not contain
superfluous parenthesis pairs. So, the forms ((A)), (((A))),
((A+B)) are not permitted. Our strategy to determine the postfix
form, in parallel, is to determine for each i, 1 ≤ i ≤ n, a value
AFTER(i) such that E(i) comes just after E(AFTER(i)) in the
postfix form. The postfix form of the expression A+B*C is ABC*+.
Since E(1:5)=(A,+,B,*,C), AFTER(1:5)=(-,4,1,5,3). Note that B
comes just after E(AFTER(3))=E(1)=A; * comes just after
E(AFTER(4))=E(5)=C; etc. Since the first token (a token is
either an operator or an operand or a delimiter) in postfix form
has no predecessor, its AFTER() value is undefined. For
convenience, we define AFTER()=0 for the token that is to come
first in the postfix form. So, for the above example,
AFTER(1:5)=(0,4,1,5,3).

In order to determine AFTER(1:n), we need to first compute the
level L(i) of each token in the expression. Informally, the
level of a token gives the depth of nesting of parentheses in
which this token is contained. So, if a token is not within any
parenthesis, its level is 0. More formally, the level, L, is
defined by the algorithm of Figure 3.


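step 1:
\[ G(i) \leftarrow \begin{cases} 1 & \text{if } E(i) = \text{'('} \\ -1 & \text{if } E(i) = \text{')'} \\ 0 & \text{otherwise} \end{cases}, \qquad 1 \le i \le n \]

step 2:
\[ L(i) \leftarrow \sum_{j=1}^{i} G(j), \qquad 1 \le i \le n \]

step 3:
\[ L(i) \leftarrow L(i) + 1 \quad \text{if } E(i) = \text{')'}, \qquad 1 \le i \le n \]

Figure 3  Computation of L.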
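The three steps of Figure 3 can be sketched sequentially as below, with `itertools.accumulate` standing in for the parallel partial sums computation. The function name `levels` is a hypothetical helper for this sketch; the test expression is the parenthesis string used later in Figure 8.

```python
from itertools import accumulate


def levels(tokens):
    """Level L(i) of every token, following steps 1-3 of Figure 3."""
    # step 1: +1 for '(', -1 for ')', 0 for everything else
    g = [1 if t == "(" else -1 if t == ")" else 0 for t in tokens]
    # step 2: prefix sums of G
    sums = list(accumulate(g))
    # step 3: a ')' gets the level of the tokens it encloses
    return [s + 1 if t == ")" else s for t, s in zip(tokens, sums)]


print(levels(list("(()()(()))")))
```

Step 3 is what puts a right parenthesis on the same level as the operators it unstacks.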

In Figure 4, we give an example arithmetic expression together
with the L() values associated with each token (row 3).


Let us sequence through procedure POSTFIX (Figure 2) as it works
on the example expression of Figure 4. When i=1, E(1)='(' and
'(' gets put onto the stack. Next, i=2, and E(2)=A is placed
into the postfix form. When i=5, the postfix form has
P(1:2)=(A,B) and the stack has the form -∞, (, *. During this
iteration, * is unstacked (as ISP(*) ≥ ICP(E(5))). We shall say
that E(3) gets unstacked by E(5). E(5) gets added to the stack
and on the next iteration, E(6)=C is placed in the postfix form.
When i=18, the stack has the form -∞, (, +, ↑, (, -, *, ↑, ↑ and
P(1:9)=(A,B,*,C,D,E,F,G,H). During this iteration, E(16)=↑,
E(14)=↑, E(12)=*, and E(10)=- get unstacked (in that order),
i.e., E(16), E(14), E(12), and E(10) get unstacked by E(18).
Furthermore, E(10) is the last operator to get unstacked by
E(18).

Figure 4


For each i such that E(i) is an operator, we may define U(i) to
be the index in E of the operator or delimiter that causes E(i)
to get unstacked. In case E(i) gets unstacked during the final
while loop of procedure POSTFIX (the loop that empties the
stack), then U(i) = n+1. For our example, U(3) = 5, U(10) =
U(12) = U(14) = U(16) = 18. Also, for each i such that E(i) is
either an operator or a right parenthesis, we may define LU(i) to
be the index of the last operator that gets unstacked by E(i).
If no operator is unstacked by E(i), then LU(i) is set to 0. For
our example, LU(3)=0, LU(5)=3,
LU(7)=LU(10)=LU(12)=LU(14)=LU(16)=0, and LU(18)=10.

Continuing with our example, we see that when i=19,
P(1:13)=(A,B,*,C,D,E,F,G,H,↑,↑,*,-) and the stack has the form
-∞, (, +, ↑. At this time, E(7)=↑ is unstacked and E(19)=* is
stacked. So, LU(19)=7 and U(7)=19. Rows 6 and 7 of Figure 4
give the U and LU values for all the operators and delimiters of
our example. Note that U is defined only for operators and LU
only for operators and right parentheses.

An examination of procedure POSTFIX and our definition of the
level L of a token reveals that if E(i) is an operator, then:

U(i) = least j, j > i, such that ISP(E(i)) ≥ ICP(E(j)) and
L(i)=L(j). If there is no j satisfying this requirement, then
U(i)=n+1.

From the definition of U, it follows that if E(i) is an operator
or a right parenthesis, then LU(i) is given by:

LU(i) = least j, j < i, such that U(j)=i. If there is no j with
U(j)=i, then LU(i)=0.
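The two characterizations above can be checked with a direct, quadratic-time sketch. It is a literal reading of the definitions, not the parallel method. The priority values are assumptions, as is the choice ICP(')') = 0 so that a right parenthesis unstacks every operator on its level. Candidates j are restricted to operators and right parentheses, and `^` stands in for ↑.

```python
from itertools import accumulate

# ASSUMED priorities (hypothetical values); ')' gets ICP 0 in this sketch.
ISP = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1}
ICP = {"^": 4, "*": 2, "/": 2, "+": 1, "-": 1, ")": 0}


def compute_u_lu(tokens):
    """Return 1-indexed dicts U (operators) and LU (operators and ')')."""
    n = len(tokens)
    E = [None] + list(tokens)                      # E[1..n]
    g = [1 if t == "(" else -1 if t == ")" else 0 for t in tokens]
    L = [None] + [s + (t == ")") for t, s in zip(tokens, accumulate(g))]
    U = {}
    for i in range(1, n + 1):
        if E[i] in ISP:                            # E(i) is an operator
            U[i] = next(
                (j for j in range(i + 1, n + 1)
                 if E[j] in ICP and L[j] == L[i] and ISP[E[i]] >= ICP[E[j]]),
                n + 1,                             # unstacked by the final loop
            )
    LU = {}
    for i in range(1, n + 1):
        if E[i] in ICP:                            # operator or ')'
            LU[i] = min((j for j in U if U[j] == i), default=0)
    return U, LU


print(compute_u_lu(list("A*(B+C)")))
```

For A*(B+C) the operator + at position 5 is unstacked by the right parenthesis at position 7, while * at position 2 stays on the stack until the final loop, so U(2) = n+1 = 8.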

From U and LU, AFTER may be determined as below.

case 1:  E(i) is an operand.

In this case, we determine the largest j, j < i, such that E(j)
is either an operand or LU(j) is defined and greater than 0 (note
that as extraneous parenthesis pairs are not permitted, if
E(j)=')' then LU(j) > 0). Such a j does not exist iff E(i) is
the first operand in the expression. From procedure POSTFIX and
our definition of LU, it follows that

\[ AFTER(i) = \begin{cases} 0 & \text{if no } j \text{ as above exists} \\ j & \text{if } E(j) \text{ is an operand} \\ LU(j) & \text{otherwise} \end{cases} \]


case 2:  E(i) is an operator.

In this case, we see that if there exists a j such that j > i and
U(j)=U(i), then AFTER(i) is the smallest j with this property.
So, in our example expression, U(10) = U(12) = U(14) = U(16) =
18. Also, in P, E(10) comes immediately after E(12) which comes
immediately after E(14). E(14) comes immediately after E(16).

For E(16), however, there is no j, j > 16 and U(j) = U(16). For
operators with this property, there are two possibilities:
either E(U(i)-1) is an operand or E(U(i)-1) is a right
parenthesis. If E(U(i)-1) is an operand, then E(U(i)-1) is the
token placed in P just before the unstacking caused by E(U(i))
begins. Hence, AFTER(i) = U(i)-1. If E(U(i)-1) is a right
parenthesis, then this right parenthesis would have caused at
least one operator to get unstacked (by assumption, extraneous
parenthesis pairs are not permitted). Hence, LU(U(i)-1) ≠ 0 and
E(LU(U(i)-1)) is the operator that immediately precedes E(i) in
P. So, we get:

j ← least j, j > i and U(j)=U(i)

\[ AFTER(i) = \begin{cases} U(i)-1 & \text{if } j \text{ is undefined and } E(U(i)-1) \text{ is an operand} \\ LU(U(i)-1) & \text{if } j \text{ is undefined and } E(U(i)-1) = \text{')'} \\ j & \text{if } j \text{ is defined} \end{cases} \]

Row 8 of Figure 4 gives the AFTER values for all the operators
and operands in our example expression. The AFTER values link
the E(i)s in the order they should appear in the postfix form.
This linked list is shown explicitly in Figure 5. From this
linked list, we wish to determine the position of each operator
and operand in the postfix form. For one of the operands, i.e.,
the one with AFTER(i)=0, this position is already known (it goes
into P(1)). With each E(i), let us associate a one bit field
K(i). K(i)=0 iff the position of E(i) in P has not been
determined. Initially, K(i)=0 for all but one of the tokens
(i.e., the one with AFTER(i)=0).

One may verify that the algorithm of Figure 6 changes all the
AFTER values so that on termination E(i) is to occupy position
AFTER(i) in P (if E(i) is a delimiter then AFTER(i) is undefined
and E(i) does not appear in the postfix form).


Figure 5


step 1  //initialize//
        case
          :AFTER(i) is undefined: K(i) ← undefined
          :AFTER(i)=0:            K(i) ← 1; AFTER(i) ← 1
          :else:                  K(i) ← 0
        endcase

step 2  //update AFTER//
        for v ← 1 to ⌈log n⌉ do
          if K(i)=0 then
            j ← AFTER(i)
            AFTER(i) ← AFTER(j)
            if K(j)=1 then K(i) ← 1; AFTER(i) ← AFTER(i)+2^(v-1)
            endif
          endif
        endfor


Figure 6  Algorithm to update AFTER

To get a feel for how the algorithm of Figure 6 works, consider
the linked list of Figure 7(a). This has 8 nodes in it. AFTER()
is shown by an arrow or link. Initially, the first (leftmost)
node has K()=1; the remaining nodes have K()=0. The K() values
are shown outside (and below) the nodes. Node indices are shown
outside and above the node. In the first iteration of step 2,
the linked list splits into two as shown in Figure 7(b) and
AFTER(2) is updated to 2. This agrees with the fact that node 2
is the second node (from the left) in Figure 7(a). In the next
iteration, each of the two lists of Figure 7(b) splits into two
and AFTER(4) is set to 3 and AFTER(1) is set to 4. Again, we see
that nodes 4 and 1 are respectively the third and fourth nodes in
Figure 7(a). In the last iteration the four lists of Figure 7(c)
split into two each, giving the configuration of Figure 7(d).
All the AFTER() values now give the position of the respective
node in the original linked list.
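The updating scheme can be simulated sequentially as in the sketch below. To mimic the synchronized PEs, each round reads only the values left by the previous round and writes into fresh copies. The increment 2^(v-1) in round v is an assumption chosen to agree with the worked example above (the increments there are 1, 2, and 4), and the function name `rank_positions` is hypothetical.

```python
import math


def rank_positions(after):
    """after: dict i -> index of predecessor (0 for the first node).
    Returns dict i -> 1-based position of node i in the linked list."""
    # step 1: the head knows its position (1); every other node is unresolved
    K = {i: 1 if a == 0 else 0 for i, a in after.items()}
    A = {i: 1 if a == 0 else a for i, a in after.items()}
    rounds = math.ceil(math.log2(len(after))) if len(after) > 1 else 0
    for v in range(1, rounds + 1):                   # step 2
        new_A, new_K = dict(A), dict(K)              # all PEs read, then all write
        for i in A:
            if K[i] == 0:
                j = A[i]
                new_A[i] = A[j]                      # jump over j
                if K[j] == 1:                        # j already knows its position
                    new_K[i] = 1
                    new_A[i] = A[j] + 2 ** (v - 1)   # j lies 2^(v-1) places before i
        A, K = new_A, new_K
    return A


# AFTER values of A+B*C from the start of this section.
pos = rank_positions({1: 0, 2: 4, 3: 1, 4: 5, 5: 3})
print(pos)
```

At the start of round v an unresolved node points 2^(v-1) places back, which is why adding that amount to the resolved predecessor's position gives the node's own position. Scattering the tokens A,+,B,*,C to these positions yields ABC*+.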


The correctness of the updating algorithm of Figure 6 may be
established formally by providing a proof by induction on the
length of the initial linked list. We omit this proof here.

Once the AFTER values have been updated as described above, the
postfix form P is obtained by executing the following
instruction:

if AFTER(i) is defined then P(AFTER(i)) ← E(i)


Figure 7


Complexity Analysis


First, let us consider the computation of the levels L
(Figure 3). Step 1 can be done in O(1) time using n PEs (each PE
is assigned to compute a different G(i)). It can also be done in
O(log n) time using n/log n PEs (each PE sequentially computes
log n of the G()s). The L(i)s of step 2 may be computed in
O(log n) time using n/log n PEs and the partial sums algorithm of
[4]. Finally, step 3 can be performed in O(log n) time using
n/log n PEs. Hence, the levels L() may be obtained in O(log n)
time using n/log n PEs.

Next, consider the computation of U and LU. One possibility is
to use mp PEs to first make m copies of each of the p operators
and right parentheses in E (m is the number of operators in E).
This takes O(log m) time. Each operator now has a copy of the
operators and right parentheses in E for itself. Each operator
E(i) is assigned p PEs to work with. These are first used to
eliminate operators and right parentheses E(j) with j ≤ i. Next,
the level and ISP of E(i) are transmitted to the remaining
operators and right parentheses. This takes O(log p) time with p
PEs. Operators and right parentheses with a different level
number or with ICP > ISP(E(i)) are eliminated. The operators and
right parentheses not yet eliminated are candidates for U(i).
The one with least j can be determined in O(log p) time using a
binary tree comparison scheme and p PEs. If there are no
candidates, U(i)=n+1. LU may now be determined in a similar
manner. Since m ≤ n and p ≤ n, this strategy to compute U and LU
takes O(n^2) PEs and O(log n) time. Using the techniques of [4],
it can be made to run in O(log n) time using only O(n^2/log n)
PEs.

An alternative strategy is to first collect together all
operators and right parentheses that have the same level number.
This can be done in O(log n) time using n PEs as follows. First,
each left parenthesis determines the position of its matching
right parenthesis. This is done by simply sorting the left and
right parentheses by their level number. If a stable sort is
used, each left parenthesis will be adjacent to its matching
right parenthesis following the sort (Figure 8). The sort can be
accomplished in O(log n) time using n PEs [13]. Now, each left
parenthesis can determine the address, M(i), of its matching
right parenthesis.


L          1  2  2  2  2  2  3  3  2  1
token      (  (  )  (  )  (  (  )  )  )
POSITION   a  b  c  d  e  f  g  h  i  j

(a) before sort


L          1  1  2  2  2  2  2  2  3  3
token      (  )  (  )  (  )  (  )  (  )
POSITION   a  j  b  c  d  e  f  i  g  h

(b) after sort


Figure 8
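The matching step of Figure 8 can be imitated with any stable sort. The sketch below uses Python's built-in sort, which is guaranteed to be stable, in place of the parallel sort of [13]; the function name `match_parens` is hypothetical.

```python
from itertools import accumulate


def match_parens(tokens):
    """Return dict M: 1-based index of each '(' -> index of its matching ')'."""
    g = [1 if t == "(" else -1 if t == ")" else 0 for t in tokens]
    level = [s + (t == ")") for t, s in zip(tokens, accumulate(g))]
    parens = [i for i, t in enumerate(tokens) if t in ("(", ")")]
    parens.sort(key=lambda i: level[i])          # stable: ties keep input order
    # After the sort the parentheses alternate '(' ')' and neighbours match.
    return {parens[k] + 1: parens[k + 1] + 1 for k in range(0, len(parens), 2)}


# Positions a..j of Figure 8 correspond to indices 1..10.
print(match_parens(list("(()()(()))")))
```

Within one level the parentheses alternate left, right, left, right in input order, so a stable sort leaves every left parenthesis immediately before its partner.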


Once M(i) has been determined for each left parenthesis E(i), we
can link together all operators and right parentheses with the
same level as needed in the computation of U. There are only two
possibilities for any operator E(i). These are:

(a)  E(i+1) = '(': In this case, E(i) is linked to M(i+1)+1.

(b)  E(i+1) ≠ '(': In this case i+2=n+1 or E(i+2) is an operator
or a right parenthesis. Regardless, E(i) is linked to i+2.
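Rules (a) and (b) translate directly into code. In this sketch the matching positions M are found with an ordinary stack, which is only a sequential stand-in for the sort-based matching described above; `level_links` and `OPERATORS` are hypothetical names.

```python
OPERATORS = set("+-*/^")       # '^' stands in for the exponentiation operator


def level_links(tokens):
    """Link each operator (1-based index) to the next operator or ')' on its level."""
    n = len(tokens)
    E = [None] + list(tokens)
    M, open_stack = {}, []
    for i in range(1, n + 1):  # sequential stand-in for the sort-based matching
        if E[i] == "(":
            open_stack.append(i)
        elif E[i] == ")":
            M[open_stack.pop()] = i
    link = {}
    for i in range(1, n + 1):
        if E[i] in OPERATORS:
            # rule (a): skip a parenthesized operand; rule (b): skip a simple operand
            link[i] = M[i + 1] + 1 if E[i + 1] == "(" else i + 2
    return link


print(level_links(list("A*(B+C)+D")))
```

In A*(B+C)+D the operator * at position 2 is linked past the parenthesized operand to + at position 8, and a link value of n+1 = 10 marks the end of the level-0 list.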


Figure 9


Performing this linkage operation on the example of Figure 4
gives the linked lists of Figure 9. Now, each linked list can be
treated independently. For operators with the highest ISP (i.e.,
↑), the U value is obtained by collapsing together consecutive
chains of ↑ so that all ↑ point to the nearest non ↑. The U
value equals the link value. So, U(7) = 19, U(14) = U(16) = 18.
For operators with the next highest ISP, the U values are
obtained by removing all nodes representing the operator ↑. The
link values give the U value. Doing this on the lists of
Figure 9 yields the lists of Figure 10. So, U(3)=5, U(19)=21,
U(12)=18, U(28)=30. Now, by eliminating all nodes that represent
* and / and collapsing the lists we can determine the U value for
the next ISP class. We obtain U(5)=21, U(21)=32, U(10)=18, and
U(25)=27. Each elimination and collapsing operation above can be
performed in O(log n) time using n PEs and the strategy used in
Figure 6 to update AFTER. Since the number of ISP classes is a
constant, the time needed to determine U is O(log n).

Figure 10


It should be evident that LU can be computed during the
computation of U. Each operator and right parenthesis keeps
track of the farthest operator it unstacks from each ISP class.
The initial values of AFTER() may now be computed. First, each
operand determines the nearest (on its left) binary operator,
right parenthesis, and operand. These are shown in Figure 11 for
our example of Figure 4. Zeroes indicate the absence of a
nearest quantity on the left. These three sets of nearest values
can be determined in O(log n) time using n PEs. For example, to
get the nearest operands, we eliminate all E(i)s that are not an
operand. The remaining E(i)s are concentrated to the left. This
enables each operand to determine its nearest left operand.
Next, the operands are distributed back to their original spots
(see [12] for an O(log n) distribution algorithm).

If E(i) is an operand and has no nearest operand on the left,
AFTER(i)=0. If the nearest binary operator (on the left) has
LU() > 0, then AFTER(i) equals this LU value. If E(i) has a
nearest right parenthesis (on the left) then AFTER(i) is the LU
value of this parenthesis. Otherwise, AFTER(i) is the location
of the nearest operand on the left.

If E(i) is an operator, we can determine the smallest j, j > i,
such that U(j)=U(i) during the computation of U and LU. So, if
such a j exists, AFTER has already been computed. If no such j
exists, AFTER(i) is to be set to either U(i)-1 or LU(U(i)-1).
Both these quantities are already known. So, the computation of
AFTER for operators takes O(1) additional time.

The updating of AFTER (Figure 6) requires only O(log n) time and
n PEs. The formation of P takes O(1) time and n PEs. Hence,
using n PEs, the postfix form may be computed in O(log n) time.
The complexity is dominated by the sort step. Another complexity
measure worth computing is the EPU (effectiveness of processor
utilization). This is:

\[ \mathrm{EPU} = \frac{\text{complexity of fastest sequential algorithm}}{\text{complexity of parallel algorithm} \times \text{no. of PEs}} = O\!\left(\frac{n}{\log n \cdot n}\right) = O\!\left(\frac{1}{\log n}\right) \]


3.  Parallel Generation of the Tree Form

As mentioned in Section 1, most code optimizers work on the tree
form of an expression. This tree form is easily obtained from
the postfix form by considering the algorithm to evaluate postfix
expressions. This algorithm is given in Figure 12 (see [6] for
an explanation).

Define the degree of an operator to be the number of operands it
needs. Let D(i) be the degree of operator P(i). So, D(i)=1 if
P(i) is a unary operator and D(i)=2 if P(i) is a binary operator.
Define W(i) as below:

\[ W(i) = \begin{cases} 1 & \text{if } P(i) \text{ is an operand} \\ 1 - D(i) & \text{otherwise} \end{cases} \]

procedure EVAL(P,n)
  //evaluate the postfix expression P(1:n)//
  declare n, P(1:n), i, STACK
  initialize STACK
  for i ← 1 to n do
    case
      :P(i) is an operand: put P(i) on the STACK
      :else: remove as many operands from the STACK as
             needed to compute P(i). Evaluate P(i) with
             these operands and put the result on the
             STACK
    endcase
  endfor
end EVAL

Figure 12

Note that W(i) gives the change in the stack height when
procedure EVAL processes P(i) (an operand increases the height by
1 while an operator reduces it by D(i)-1). The stack height,
H(i), following the processing of P(i) is given by:

\[ H(i) = \sum_{j=1}^{i} W(j) \tag{3.1} \]
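Equation (3.1) is again a partial sums computation. A small sequential sketch follows, assuming single-character tokens and a hypothetical `DEGREE` table that lists only binary operators (`^` stands in for ↑):

```python
from itertools import accumulate

# ASSUMED operator table for this sketch: every operator listed is binary.
DEGREE = {"+": 2, "-": 2, "*": 2, "/": 2, "^": 2}


def stack_heights(postfix):
    """H(i) for i = 1..n: stack height after EVAL has processed P(i)."""
    # W(i) is 1 for an operand and 1 - D(i) for an operator
    w = [1 - DEGREE[t] if t in DEGREE else 1 for t in postfix]
    return list(accumulate(w))


print(stack_heights(list("ABC*+")))
```

For a well-formed postfix expression the last height is 1, since EVAL ends with exactly the result on the stack.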


Let us make the simplifying assumption that we have no operator
of degree greater than 2.

The tree form of the expression P(1:n) consists of n nodes. Each
node has three fields: LCHILD, RCHILD, and P. It is easy to see
that LCHILD(i)=RCHILD(i)=0 for every i such that P(i) is an
operand. Also, if P(i) is an operator, then RCHILD(i)=i-1. If
P(i) is a unary operator, LCHILD(i)=0. This leaves us with the
task of determining LCHILD(i) when P(i) is a binary operator. It
is not too difficult to see that in this case, LCHILD(i) is the
largest j, j < i, such that H(j)=H(i).

The LCHILD values for binary operators can therefore be obtained
by first computing H(i) as given by (3.1). This can be done in
O(log n) time using either n or n/log n PEs and the partial sums
algorithm of [4]. Figure 13 shows the postfix form of our
example of Figure 4. The W values are given in row 2 and the H
values in row 3.

Next the H(i)s are sorted using a stable sort method. This takes
O(log n) time and O(n) PEs [13]. This sort brings a parent and
its left child (if the parent is a binary operator) together.
So, in our example P(7) and P(9) are brought together. So also
are P(6) and P(10); P(5) and P(11); P(4) and P(12); etc. Hence
every binary operator can now easily determine its left child.
The expression tree that results for our example is shown in
Figure 14.

The additional time needed to obtain the tree is O(log n) and the
number of PEs needed is n. Using the postfix algorithm of
Section 2, the tree form may be obtained from the infix form in
O(log n) time using n PEs. The EPU of the resulting tree form
algorithm is O(1/log n).
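The construction of this section can be sketched sequentially as follows, assuming only operands and binary operators occur. Python's stable `sorted` stands in for the parallel stable sort of [13], and `expression_tree` is a hypothetical name.

```python
from itertools import accumulate

BINARY = set("+-*/^")          # ASSUMED operator set; '^' stands in for exponentiation


def expression_tree(postfix):
    """Return (LCHILD, RCHILD) as 1-indexed lists; 0 means 'no child'."""
    n = len(postfix)
    w = [-1 if t in BINARY else 1 for t in postfix]     # W(i) for degree 2
    H = [0] + list(accumulate(w))                       # H[1..n], equation (3.1)
    lchild = [0] * (n + 1)
    rchild = [0] * (n + 1)
    order = sorted(range(1, n + 1), key=lambda i: H[i]) # stable sort on H
    for prev, cur in zip(order, order[1:]):
        if postfix[cur - 1] in BINARY and H[prev] == H[cur]:
            lchild[cur] = prev      # largest j < cur with H(j) = H(cur)
            rchild[cur] = cur - 1   # right operand ends just before the operator
    return lchild, rchild


print(expression_tree(list("ABC*+")))
```

For ABC*+ the operator * at position 4 gets children 2 and 3 (B and C), and + at position 5 gets children 1 and 4 (A and the * subtree).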


4.  Conclusions 


We have shown that it is possible to effectively parallelize the
postfix and tree form algorithms. Our parallel algorithms run in
O(log n) time when n PEs are available. If only n/k PEs are
available, our algorithms can still be used. The complexity will
be O(k log n).

The results of this paper nicely complement the work reported on
the parallel evaluation of expressions (see [1], [2], [8], [10],
and [11]).